Overview

Dataset statistics

Number of variables: 27
Number of observations: 8732
Missing cells: 77706
Missing cells (%): 33.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 MiB
Average record size in memory: 247.3 B
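
The missing-cell percentage is consistent with dividing the missing-cell count by the total number of cells (variables times observations). The following is a minimal check using only the figures reported above; the function name `missing_cell_fraction` is illustrative and not part of the profiling tool.

```python
def missing_cell_fraction(n_missing: int, n_variables: int, n_observations: int) -> float:
    """Fraction of all cells (variables x observations) that are missing."""
    total_cells = n_variables * n_observations
    return n_missing / total_cells


# Figures taken from the dataset statistics above.
fraction = missing_cell_fraction(77706, 27, 8732)
print(f"{fraction:.1%}")
```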

Variable types

Numeric: 7
Categorical: 16
Boolean: 4

Alerts

mel_mitotic_index has constant value "0/mm^2" [Constant]
isic_id has a high cardinality: 8732 distinct values [High cardinality]
lesion_id has a high cardinality: 8370 distinct values [High cardinality]
patient_id has a high cardinality: 931 distinct values [High cardinality]
attribution is highly imbalanced (51.7%) [Imbalance]
benign_malignant is highly imbalanced (84.3%) [Imbalance]
dermoscopic_type is highly imbalanced (56.3%) [Imbalance]
diagnosis_confirm_type is highly imbalanced (55.1%) [Imbalance]
image_type is highly imbalanced (90.5%) [Imbalance]
mel_type is highly imbalanced (58.6%) [Imbalance]
nevus_type is highly imbalanced (71.4%) [Imbalance]
acquisition_day has 1624 (18.6%) missing values [Missing]
anatom_site_general has 226 (2.6%) missing values [Missing]
clin_size_long_diam_mm has 7906 (90.5%) missing values [Missing]
dermoscopic_type has 190 (2.2%) missing values [Missing]
diagnosis has 6998 (80.1%) missing values [Missing]
diagnosis_confirm_type has 394 (4.5%) missing values [Missing]
lesion_id has 262 (3.0%) missing values [Missing]
mel_class has 8584 (98.3%) missing values [Missing]
mel_thick_mm has 8634 (98.9%) missing values [Missing]
mel_type has 8720 (99.9%) missing values [Missing]
mel_ulcer has 8668 (99.3%) missing values [Missing]
melanocytic has 7869 (90.1%) missing values [Missing]
nevus_type has 8616 (98.7%) missing values [Missing]
patient_id has 262 (3.0%) missing values [Missing]
mel_mitotic_index has 8731 (> 99.9%) missing values [Missing]
isic_id is uniformly distributed [Uniform]
lesion_id is uniformly distributed [Uniform]
isic_id has unique values [Unique]
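
The percentages in the Imbalance alerts match one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how the score is derived, checked only against the counts this report gives for benign_malignant and image_type; `imbalance_score` is an illustrative name, not a documented function of the profiling tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class proportions and k is the number of classes.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column is scored 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts as listed in the per-variable sections of this report.
benign_malignant = [8296, 389, 32, 14]
image_type = [8625, 106]

print(round(100 * imbalance_score(benign_malignant), 1))
print(round(100 * imbalance_score(image_type), 1))
```

Under this reading, a score of 84.3% for benign_malignant says the label distribution carries only about 16% of the entropy a uniform four-class column would have.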

Reproduction

Analysis started: 2023-08-15 12:02:42.971146
Analysis finished: 2023-08-15 12:02:50.369341
Duration: 7.4 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ)

Distinct: 5127
Distinct (%): 58.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2248.8217
Minimum: 0
Maximum: 5126
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 136.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 218
Q1: 1091
Median: 2182.5
Q3: 3274
95th percentile: 4689.45
Maximum: 5126
Range: 5126
Interquartile range (IQR): 2183

Descriptive statistics

Standard deviation: 1368.8947
Coefficient of variation (CV): 0.60871644
Kurtosis: -0.93115469
Mean: 2248.8217
Median Absolute Deviation (MAD): 1091.5
Skewness: 0.2317578
Sum: 19636711
Variance: 1873872.8
Monotonicity: Not monotonic
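
Several of the derived statistics above follow directly from the others. The short sketch below recomputes them from the reported values for the "Unnamed: 0" column; it is a consistency check under the usual textbook definitions, not the profiling tool's own code.

```python
# Values reported above for the "Unnamed: 0" column.
mean = 2248.8217
std = 1368.8947
q1, q3 = 1091, 3274
minimum, maximum = 0, 5126

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.1f} IQR={iqr} range={value_range}")
```

Small differences in the last digit of the variance are expected, because the standard deviation is itself a rounded figure.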
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     2       < 0.1%
2367                  2       < 0.1%
2397                  2       < 0.1%
2398                  2       < 0.1%
2399                  2       < 0.1%
2400                  2       < 0.1%
2401                  2       < 0.1%
2402                  2       < 0.1%
2403                  2       < 0.1%
2404                  2       < 0.1%
Other values (5117)   8712    99.8%

Value                 Count   Frequency (%)
0                     2       < 0.1%
1                     2       < 0.1%
2                     2       < 0.1%
3                     2       < 0.1%
4                     2       < 0.1%
5                     2       < 0.1%
6                     2       < 0.1%
7                     2       < 0.1%
8                     2       < 0.1%
9                     2       < 0.1%

Value                 Count   Frequency (%)
5126                  1       < 0.1%
5125                  1       < 0.1%
5124                  1       < 0.1%
5123                  1       < 0.1%
5122                  1       < 0.1%
5121                  1       < 0.1%
5120                  1       < 0.1%
5119                  1       < 0.1%
5118                  1       < 0.1%
5117                  1       < 0.1%

isic_id
Categorical

HIGH CARDINALITY  UNIFORM  UNIQUE 

Distinct: 8732
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 136.4 KiB

ISIC_3079785: 1
ISIC_4425604: 1
ISIC_0963117: 1
ISIC_1693686: 1
ISIC_5435863: 1
Other values (8727): 8727

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 104784
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
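
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one identifier from this report; the helper name `category_counts` is hypothetical, and this is not claimed to be how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("ISIC_3079785")
# Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation
print(dict(counts))
```

Each 12-character identifier contributes 4 uppercase letters, 7 decimal digits and 1 connector punctuation mark, which agrees with the category totals reported for isic_id across 8732 rows (34928, 61124 and 8732).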

Unique

Unique: 8732
Unique (%): 100.0%

Sample

1st row: ISIC_3079785
2nd row: ISIC_2107859
3rd row: ISIC_3443621
4th row: ISIC_2368449
5th row: ISIC_0094098

Common Values

Value                 Count   Frequency (%)
ISIC_3079785          1       < 0.1%
ISIC_4425604          1       < 0.1%
ISIC_0963117          1       < 0.1%
ISIC_1693686          1       < 0.1%
ISIC_5435863          1       < 0.1%
ISIC_3298596          1       < 0.1%
ISIC_1384413          1       < 0.1%
ISIC_9800787          1       < 0.1%
ISIC_4325003          1       < 0.1%
ISIC_9489038          1       < 0.1%
Other values (8722)   8722    99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
isic_3079785 1
 
< 0.1%
isic_5911355 1
 
< 0.1%
isic_5948139 1
 
< 0.1%
isic_0942544 1
 
< 0.1%
isic_6168031 1
 
< 0.1%
isic_3550047 1
 
< 0.1%
isic_1714551 1
 
< 0.1%
isic_1497097 1
 
< 0.1%
isic_4326975 1
 
< 0.1%
isic_8954688 1
 
< 0.1%
Other values (8722) 8722
99.9%

Most occurring characters

ValueCountFrequency (%)
I 17464
16.7%
S 8732
 
8.3%
C 8732
 
8.3%
_ 8732
 
8.3%
0 6533
 
6.2%
2 6218
 
5.9%
3 6119
 
5.8%
1 6109
 
5.8%
5 6063
 
5.8%
8 6057
 
5.8%
Other values (4) 24025
22.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 61124
58.3%
Uppercase Letter 34928
33.3%
Connector Punctuation 8732
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6533
10.7%
2 6218
10.2%
3 6119
10.0%
1 6109
10.0%
5 6063
9.9%
8 6057
9.9%
4 6045
9.9%
7 6042
9.9%
6 5983
9.8%
9 5955
9.7%
Uppercase Letter
ValueCountFrequency (%)
I 17464
50.0%
S 8732
25.0%
C 8732
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8732
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 69856
66.7%
Latin 34928
33.3%

Most frequent character per script

Common
ValueCountFrequency (%)
_ 8732
12.5%
0 6533
9.4%
2 6218
8.9%
3 6119
8.8%
1 6109
8.7%
5 6063
8.7%
8 6057
8.7%
4 6045
8.7%
7 6042
8.6%
6 5983
8.6%
Latin
ValueCountFrequency (%)
I 17464
50.0%
S 8732
25.0%
C 8732
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 104784
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 17464
16.7%
S 8732
 
8.3%
C 8732
 
8.3%
_ 8732
 
8.3%
0 6533
 
6.2%
2 6218
 
5.9%
3 6119
 
5.8%
1 6109
 
5.8%
5 6063
 
5.8%
8 6057
 
5.8%
Other values (4) 24025
22.9%

attribution
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.4 KiB

The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre: 7108
Hospital Italiano de Buenos Aires: 761
Memorial Sloan Kettering Cancer Center: 601
Anonymous: 262

Length

Max length: 108
Median length: 108
Mean length: 93.675332
Min length: 9

Characters and Unicode

Total characters: 817973
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Hospital Italiano de Buenos Aires
2nd row: Hospital Italiano de Buenos Aires
3rd row: Hospital Italiano de Buenos Aires
4th row: Hospital Italiano de Buenos Aires
5th row: Hospital Italiano de Buenos Aires

Common Values

Value | Count | Frequency (%)
The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | 7108 | 81.4%
Hospital Italiano de Buenos Aires | 761 | 8.7%
Memorial Sloan Kettering Cancer Center | 601 | 6.9%
Anonymous | 262 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
the 14216
14.3%
of 14216
14.3%
queensland 14216
14.3%
university 14216
14.3%
diamantina 7108
7.1%
institute 7108
7.1%
dermatology 7108
7.1%
research 7108
7.1%
centre 7108
7.1%
buenos 761
 
0.8%
Other values (10) 6311
6.3%

Most occurring characters

ValueCountFrequency (%)
e 105401
12.9%
90744
 
11.1%
n 75530
 
9.2%
t 60189
 
7.4%
a 53842
 
6.6%
i 53241
 
6.5%
s 45193
 
5.5%
r 38705
 
4.7%
o 32441
 
4.0%
l 24048
 
2.9%
Other values (24) 238639
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 628514
76.8%
Space Separator 90744
 
11.1%
Uppercase Letter 84499
 
10.3%
Other Punctuation 14216
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 105401
16.8%
n 75530
12.0%
t 60189
9.6%
a 53842
8.6%
i 53241
8.5%
s 45193
 
7.2%
r 38705
 
6.2%
o 32441
 
5.2%
l 24048
 
3.8%
u 22347
 
3.6%
Other values (9) 117577
18.7%
Uppercase Letter
ValueCountFrequency (%)
D 14216
16.8%
T 14216
16.8%
Q 14216
16.8%
U 14216
16.8%
C 8310
9.8%
I 7869
9.3%
R 7108
8.4%
A 1023
 
1.2%
H 761
 
0.9%
B 761
 
0.9%
Other values (3) 1803
 
2.1%
Space Separator
ValueCountFrequency (%)
90744
100.0%
Other Punctuation
ValueCountFrequency (%)
, 14216
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 713013
87.2%
Common 104960
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 105401
14.8%
n 75530
 
10.6%
t 60189
 
8.4%
a 53842
 
7.6%
i 53241
 
7.5%
s 45193
 
6.3%
r 38705
 
5.4%
o 32441
 
4.5%
l 24048
 
3.4%
u 22347
 
3.1%
Other values (22) 202076
28.3%
Common
ValueCountFrequency (%)
90744
86.5%
, 14216
 
13.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 817973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 105401
12.9%
90744
 
11.1%
n 75530
 
9.2%
t 60189
 
7.4%
a 53842
 
6.6%
i 53241
 
6.5%
s 45193
 
5.5%
r 38705
 
4.7%
o 32441
 
4.0%
l 24048
 
2.9%
Other values (24) 238639
29.2%
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 136.4 KiB

CC-BY: 7108
CC-0: 863
CC-BY-NC: 761

Length

Max length: 8
Median length: 5
Mean length: 5.1626202
Min length: 4

Characters and Unicode

Total characters: 45080
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC-BY-NC
2nd row: CC-BY-NC
3rd row: CC-BY-NC
4th row: CC-BY-NC
5th row: CC-BY-NC

Common Values

Value      Count   Frequency (%)
CC-BY      7108    81.4%
CC-0       863     9.9%
CC-BY-NC   761     8.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cc-by 7108
81.4%
cc-0 863
 
9.9%
cc-by-nc 761
 
8.7%

Most occurring characters

ValueCountFrequency (%)
C 18225
40.4%
- 9493
21.1%
B 7869
17.5%
Y 7869
17.5%
0 863
 
1.9%
N 761
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 34724
77.0%
Dash Punctuation 9493
 
21.1%
Decimal Number 863
 
1.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 18225
52.5%
B 7869
22.7%
Y 7869
22.7%
N 761
 
2.2%
Dash Punctuation
ValueCountFrequency (%)
- 9493
100.0%
Decimal Number
ValueCountFrequency (%)
0 863
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34724
77.0%
Common 10356
 
23.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 18225
52.5%
B 7869
22.7%
Y 7869
22.7%
N 761
 
2.2%
Common
ValueCountFrequency (%)
- 9493
91.7%
0 863
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 18225
40.4%
- 9493
21.1%
B 7869
17.5%
Y 7869
17.5%
0 863
 
1.9%
N 761
 
1.7%

acquisition_day
Real number (ℝ)

Distinct: 314
Distinct (%): 4.4%
Missing: 1624
Missing (%): 18.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 225.86593
Minimum: 1
Maximum: 1021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 174
Q3: 371
95th percentile: 805
Maximum: 1021
Range: 1020
Interquartile range (IQR): 370

Descriptive statistics

Standard deviation: 251.12056
Coefficient of variation (CV): 1.1118125
Kurtosis: 0.56172615
Mean: 225.86593
Median Absolute Deviation (MAD): 173
Skewness: 1.1107538
Sum: 1605455
Variance: 63061.535
Monotonicity: Not monotonic
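
Because acquisition_day has missing values, the reported mean is consistent with dividing the sum by the number of non-missing observations rather than by all 8732 rows. A minimal check using only figures from this report (variable names are illustrative):

```python
n_observations = 8732   # rows in the dataset
n_missing = 1624        # missing acquisition_day values
total = 1605455         # "Sum" reported for acquisition_day

n_valid = n_observations - n_missing
mean = total / n_valid  # mean over non-missing values only

print(n_valid, round(mean, 5))
print(f"{n_missing / n_observations:.1%}")
```

Dividing by all 8732 rows instead would give a noticeably lower figure, so imputing or dropping the missing rows will change any downstream summary of this column.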
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 2776
31.8%
176 161
 
1.8%
348 160
 
1.8%
183 155
 
1.8%
116 92
 
1.1%
488 86
 
1.0%
661 84
 
1.0%
358 84
 
1.0%
372 83
 
1.0%
387 80
 
0.9%
Other values (304) 3347
38.3%
(Missing) 1624
18.6%
ValueCountFrequency (%)
1 2776
31.8%
5 1
 
< 0.1%
50 1
 
< 0.1%
74 1
 
< 0.1%
85 1
 
< 0.1%
92 11
 
0.1%
94 2
 
< 0.1%
96 19
 
0.2%
99 1
 
< 0.1%
104 12
 
0.1%
ValueCountFrequency (%)
1021 14
0.2%
1017 11
0.1%
995 9
0.1%
986 9
0.1%
982 9
0.1%
979 12
0.1%
967 16
0.2%
964 4
 
< 0.1%
961 3
 
< 0.1%
959 6
 
0.1%

age_approx
Real number (ℝ)

Distinct: 15
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.010994
Minimum: 15
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.4 KiB

Quantile statistics

Minimum: 15
5th percentile: 30
Q1: 40
Median: 50
Q3: 65
95th percentile: 75
Maximum: 85
Range: 70
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.27569
Coefficient of variation (CV): 0.27447447
Kurtosis: -0.64190445
Mean: 52.010994
Median Absolute Deviation (MAD): 10
Skewness: -0.117077
Sum: 454160
Variance: 203.79532
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
ValueCountFrequency (%)
45 1176
13.5%
60 1063
12.2%
65 1023
11.7%
50 1021
11.7%
55 943
10.8%
70 746
8.5%
35 658
7.5%
40 657
7.5%
30 523
6.0%
75 306
 
3.5%
Other values (5) 616
7.1%
ValueCountFrequency (%)
15 2
 
< 0.1%
20 149
 
1.7%
25 230
 
2.6%
30 523
6.0%
35 658
7.5%
40 657
7.5%
45 1176
13.5%
50 1021
11.7%
55 943
10.8%
60 1063
12.2%
ValueCountFrequency (%)
85 64
 
0.7%
80 171
 
2.0%
75 306
 
3.5%
70 746
8.5%
65 1023
11.7%
60 1063
12.2%
55 943
10.8%
50 1021
11.7%
45 1176
13.5%
40 657
7.5%

anatom_site_general
Categorical

Distinct: 8
Distinct (%): 0.1%
Missing: 226
Missing (%): 2.6%
Memory size: 136.4 KiB

posterior torso: 2652
lower extremity: 1843
upper extremity: 1571
anterior torso: 1385
head/neck: 775
Other values (3): 280

Length

Max length: 15
Median length: 15
Mean length: 14.218199
Min length: 9

Characters and Unicode

Total characters: 120940
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: lower extremity
2nd row: head/neck
3rd row: head/neck
4th row: head/neck
5th row: posterior torso

Common Values

Value             Count   Frequency (%)
posterior torso   2652    30.4%
lower extremity   1843    21.1%
upper extremity   1571    18.0%
anterior torso    1385    15.9%
head/neck         775     8.9%
lateral torso     249     2.9%
palms/soles       24      0.3%
oral/genital      7       0.1%
(Missing)         226     2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
torso 4286
26.4%
extremity 3414
21.1%
posterior 2652
16.4%
lower 1843
11.4%
upper 1571
 
9.7%
anterior 1385
 
8.5%
head/neck 775
 
4.8%
lateral 249
 
1.5%
palms/soles 24
 
0.1%
oral/genital 7
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
r 19444
16.1%
o 17135
14.2%
e 16109
13.3%
t 15407
12.7%
7700
 
6.4%
i 7458
 
6.2%
s 7010
 
5.8%
p 5818
 
4.8%
m 3438
 
2.8%
x 3414
 
2.8%
Other values (12) 18007
14.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 112434
93.0%
Space Separator 7700
 
6.4%
Other Punctuation 806
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 19444
17.3%
o 17135
15.2%
e 16109
14.3%
t 15407
13.7%
i 7458
 
6.6%
s 7010
 
6.2%
p 5818
 
5.2%
m 3438
 
3.1%
x 3414
 
3.0%
y 3414
 
3.0%
Other values (10) 13787
12.3%
Space Separator
ValueCountFrequency (%)
7700
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 806
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 112434
93.0%
Common 8506
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 19444
17.3%
o 17135
15.2%
e 16109
14.3%
t 15407
13.7%
i 7458
 
6.6%
s 7010
 
6.2%
p 5818
 
5.2%
m 3438
 
3.1%
x 3414
 
3.0%
y 3414
 
3.0%
Other values (10) 13787
12.3%
Common
ValueCountFrequency (%)
7700
90.5%
/ 806
 
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 120940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 19444
16.1%
o 17135
14.2%
e 16109
13.3%
t 15407
12.7%
7700
 
6.4%
i 7458
 
6.2%
s 7010
 
5.8%
p 5818
 
4.8%
m 3438
 
2.8%
x 3414
 
2.8%
Other values (12) 18007
14.9%

benign_malignant
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 136.4 KiB

benign: 8296
malignant: 389
indeterminate/benign: 32
indeterminate/malignant: 14

Length

Max length: 23
Median length: 6
Mean length: 6.2122323
Min length: 6

Characters and Unicode

Total characters: 54239
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: benign
2nd row: malignant
3rd row: benign
4th row: malignant
5th row: malignant

Common Values

Value                     Count   Frequency (%)
benign                    8296    95.0%
malignant                 389     4.5%
indeterminate/benign      32      0.4%
indeterminate/malignant   14      0.2%
(Missing)                 1       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
benign 8296
95.0%
malignant 389
 
4.5%
indeterminate/benign 32
 
0.4%
indeterminate/malignant 14
 
0.2%

Most occurring characters

ValueCountFrequency (%)
n 17554
32.4%
i 8823
16.3%
g 8731
16.1%
e 8466
15.6%
b 8328
15.4%
a 852
 
1.6%
t 495
 
0.9%
m 449
 
0.8%
l 403
 
0.7%
d 46
 
0.1%
Other values (2) 92
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 54193
99.9%
Other Punctuation 46
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 17554
32.4%
i 8823
16.3%
g 8731
16.1%
e 8466
15.6%
b 8328
15.4%
a 852
 
1.6%
t 495
 
0.9%
m 449
 
0.8%
l 403
 
0.7%
d 46
 
0.1%
Other Punctuation
ValueCountFrequency (%)
/ 46
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 54193
99.9%
Common 46
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 17554
32.4%
i 8823
16.3%
g 8731
16.1%
e 8466
15.6%
b 8328
15.4%
a 852
 
1.6%
t 495
 
0.9%
m 449
 
0.8%
l 403
 
0.7%
d 46
 
0.1%
Common
ValueCountFrequency (%)
/ 46
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54239
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 17554
32.4%
i 8823
16.3%
g 8731
16.1%
e 8466
15.6%
b 8328
15.4%
a 852
 
1.6%
t 495
 
0.9%
m 449
 
0.8%
l 403
 
0.7%
d 46
 
0.1%
Other values (2) 92
 
0.2%

clin_size_long_diam_mm
Real number (ℝ)

Distinct: 135
Distinct (%): 16.3%
Missing: 7906
Missing (%): 90.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.3772397
Minimum: 1.1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 136.4 KiB

Quantile statistics

Minimum: 1.1
5th percentile: 2.8
Q1: 4.3
Median: 5.7
Q3: 7.675
95th percentile: 11.85
Maximum: 20
Range: 18.9
Interquartile range (IQR): 3.375

Descriptive statistics

Standard deviation: 3.0393205
Coefficient of variation (CV): 0.47658872
Kurtosis: 3.4139821
Mean: 6.3772397
Median Absolute Deviation (MAD): 1.6
Skewness: 1.5376735
Sum: 5267.6
Variance: 9.2374692
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4.1 23
 
0.3%
5 20
 
0.2%
4.8 20
 
0.2%
4.4 18
 
0.2%
6 17
 
0.2%
5.9 16
 
0.2%
5.6 16
 
0.2%
5.4 16
 
0.2%
5.3 16
 
0.2%
4.9 16
 
0.2%
Other values (125) 648
 
7.4%
(Missing) 7906
90.5%
ValueCountFrequency (%)
1.1 1
 
< 0.1%
1.4 2
< 0.1%
1.5 2
< 0.1%
1.6 1
 
< 0.1%
1.7 1
 
< 0.1%
1.8 3
< 0.1%
1.9 3
< 0.1%
2.1 3
< 0.1%
2.2 4
< 0.1%
2.3 3
< 0.1%
ValueCountFrequency (%)
20 2
< 0.1%
19.7 1
< 0.1%
19.4 1
< 0.1%
19.1 2
< 0.1%
19 1
< 0.1%
17.9 2
< 0.1%
17.5 1
< 0.1%
17.2 2
< 0.1%
17 1
< 0.1%
16 1
< 0.1%

dermoscopic_type
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 190
Missing (%): 2.2%
Memory size: 136.4 KiB

contact non-polarized: 7213
contact polarized: 1206
non-contact polarized: 123

Length

Max length: 21
Median length: 21
Mean length: 20.435261
Min length: 17

Characters and Unicode

Total characters: 174558
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: contact polarized
2nd row: contact polarized
3rd row: contact polarized
4th row: contact polarized
5th row: contact polarized

Common Values

Value                   Count   Frequency (%)
contact non-polarized   7213    82.6%
contact polarized       1206    13.8%
non-contact polarized   123     1.4%
(Missing)               190     2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
contact 8419
49.3%
non-polarized 7213
42.2%
polarized 1329
 
7.8%
non-contact 123
 
0.7%

Most occurring characters

ValueCountFrequency (%)
o 24420
14.0%
n 23214
13.3%
c 17084
9.8%
t 17084
9.8%
a 17084
9.8%
8542
 
4.9%
p 8542
 
4.9%
l 8542
 
4.9%
r 8542
 
4.9%
i 8542
 
4.9%
Other values (4) 32962
18.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 158680
90.9%
Space Separator 8542
 
4.9%
Dash Punctuation 7336
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 24420
15.4%
n 23214
14.6%
c 17084
10.8%
t 17084
10.8%
a 17084
10.8%
p 8542
 
5.4%
l 8542
 
5.4%
r 8542
 
5.4%
i 8542
 
5.4%
z 8542
 
5.4%
Other values (2) 17084
10.8%
Space Separator
ValueCountFrequency (%)
8542
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7336
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 158680
90.9%
Common 15878
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 24420
15.4%
n 23214
14.6%
c 17084
10.8%
t 17084
10.8%
a 17084
10.8%
p 8542
 
5.4%
l 8542
 
5.4%
r 8542
 
5.4%
i 8542
 
5.4%
z 8542
 
5.4%
Other values (2) 17084
10.8%
Common
ValueCountFrequency (%)
8542
53.8%
- 7336
46.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 174558
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 24420
14.0%
n 23214
13.3%
c 17084
9.8%
t 17084
9.8%
a 17084
9.8%
8542
 
4.9%
p 8542
 
4.9%
l 8542
 
4.9%
r 8542
 
4.9%
i 8542
 
4.9%
Other values (4) 32962
18.9%

diagnosis
Categorical

Distinct: 20
Distinct (%): 1.2%
Missing: 6998
Missing (%): 80.1%
Memory size: 136.4 KiB

nevus: 991
melanoma: 274
seborrheic keratosis: 132
basal cell carcinoma: 82
lentigo NOS: 51
Other values (15): 204

Length

Max length: 34
Median length: 5
Mean length: 9.2064591
Min length: 4

Characters and Unicode

Total characters: 15964
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 0.2%

Sample

1st row: nevus
2nd row: melanoma
3rd row: solar lentigo
4th row: melanoma
5th row: melanoma

Common Values

Value                                Count   Frequency (%)
nevus                                991     11.3%
melanoma                             274     3.1%
seborrheic keratosis                 132     1.5%
basal cell carcinoma                 82      0.9%
lentigo NOS                          51      0.6%
squamous cell carcinoma              45      0.5%
actinic keratosis                    30      0.3%
atypical melanocytic proliferation   28      0.3%
lichenoid keratosis                  27      0.3%
dermatofibroma                       26      0.3%
Other values (10)                    48      0.5%
(Missing)                            6998    80.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
nevus 991
42.7%
melanoma 276
 
11.9%
keratosis 189
 
8.1%
seborrheic 132
 
5.7%
cell 128
 
5.5%
carcinoma 127
 
5.5%
basal 82
 
3.5%
lentigo 64
 
2.8%
nos 51
 
2.2%
squamous 45
 
1.9%
Other values (21) 237
 
10.2%

Most occurring characters

ValueCountFrequency (%)
e 2052
12.9%
s 1731
10.8%
n 1593
10.0%
a 1465
9.2%
u 1103
 
6.9%
o 1043
 
6.5%
v 1011
 
6.3%
l 840
 
5.3%
m 809
 
5.1%
i 788
 
4.9%
Other values (16) 3529
22.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15223
95.4%
Space Separator 588
 
3.7%
Uppercase Letter 153
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2052
13.5%
s 1731
11.4%
n 1593
10.5%
a 1465
9.6%
u 1103
 
7.2%
o 1043
 
6.9%
v 1011
 
6.6%
l 840
 
5.5%
m 809
 
5.3%
i 788
 
5.2%
Other values (12) 2788
18.3%
Uppercase Letter
ValueCountFrequency (%)
N 51
33.3%
O 51
33.3%
S 51
33.3%
Space Separator
ValueCountFrequency (%)
588
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15376
96.3%
Common 588
 
3.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2052
13.3%
s 1731
11.3%
n 1593
10.4%
a 1465
9.5%
u 1103
 
7.2%
o 1043
 
6.8%
v 1011
 
6.6%
l 840
 
5.5%
m 809
 
5.3%
i 788
 
5.1%
Other values (15) 2941
19.1%
Common
ValueCountFrequency (%)
588
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15964
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2052
12.9%
s 1731
10.8%
n 1593
10.0%
a 1465
9.2%
u 1103
 
6.9%
o 1043
 
6.5%
v 1011
 
6.3%
l 840
 
5.3%
m 809
 
5.1%
i 788
 
4.9%
Other values (16) 3529
22.1%

diagnosis_confirm_type
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 394
Missing (%): 4.5%
Memory size: 136.4 KiB

serial imaging showing no change: 6922
histopathology: 1333
single image expert consensus: 83

Length

Max length: 32
Median length: 32
Mean length: 29.092468
Min length: 14

Characters and Unicode

Total characters: 242573
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: histopathology
2nd row: histopathology
3rd row: histopathology
4th row: histopathology
5th row: histopathology

Common Values

Value                              Count   Frequency (%)
serial imaging showing no change   6922    79.3%
histopathology                     1333    15.3%
single image expert consensus      83      1.0%
(Missing)                          394     4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
serial 6922
19.1%
imaging 6922
19.1%
showing 6922
19.1%
no 6922
19.1%
change 6922
19.1%
histopathology 1333
 
3.7%
single 83
 
0.2%
image 83
 
0.2%
expert 83
 
0.2%
consensus 83
 
0.2%

Most occurring characters

ValueCountFrequency (%)
g 29187
12.0%
i 29187
12.0%
n 27937
11.5%
27937
11.5%
a 22182
9.1%
o 17926
7.4%
h 16510
6.8%
s 15509
6.4%
e 14259
5.9%
l 8338
 
3.4%
Other values (9) 33601
13.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 214636
88.5%
Space Separator 27937
 
11.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 29187
13.6%
i 29187
13.6%
n 27937
13.0%
a 22182
10.3%
o 17926
8.4%
h 16510
7.7%
s 15509
7.2%
e 14259
6.6%
l 8338
 
3.9%
m 7005
 
3.3%
Other values (8) 26596
12.4%
Space Separator
ValueCountFrequency (%)
27937
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 214636
88.5%
Common 27937
 
11.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 29187
13.6%
i 29187
13.6%
n 27937
13.0%
a 22182
10.3%
o 17926
8.4%
h 16510
7.7%
s 15509
7.2%
e 14259
6.6%
l 8338
 
3.9%
m 7005
 
3.3%
Other values (8) 26596
12.4%
Common
ValueCountFrequency (%)
27937
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 242573
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 29187
12.0%
i 29187
12.0%
n 27937
11.5%
27937
11.5%
a 22182
9.1%
o 17926
7.4%
h 16510
6.8%
s 15509
6.4%
e 14259
5.9%
l 8338
 
3.4%
Other values (9) 33601
13.9%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 76.7 KiB

False: 5127
True: 3605

Value   Count   Frequency (%)
False   5127    58.7%
True    3605    41.3%

image_type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 136.4 KiB

dermoscopic: 8625
clinical: 106

Length

Max length: 11
Median length: 11
Mean length: 10.963578
Min length: 8

Characters and Unicode

Total characters: 95723
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: dermoscopic
2nd row: clinical
3rd row: dermoscopic
4th row: dermoscopic
5th row: clinical

Common Values

Value         Count   Frequency (%)
dermoscopic   8625    98.8%
clinical      106     1.2%
(Missing)     1       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
dermoscopic 8625
98.8%
clinical 106
 
1.2%

Most occurring characters

ValueCountFrequency (%)
c 17462
18.2%
o 17250
18.0%
i 8837
9.2%
d 8625
9.0%
e 8625
9.0%
r 8625
9.0%
m 8625
9.0%
s 8625
9.0%
p 8625
9.0%
l 212
 
0.2%
Other values (2) 212
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 95723
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 17462
18.2%
o 17250
18.0%
i 8837
9.2%
d 8625
9.0%
e 8625
9.0%
r 8625
9.0%
m 8625
9.0%
s 8625
9.0%
p 8625
9.0%
l 212
 
0.2%
Other values (2) 212
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 95723
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 17462
18.2%
o 17250
18.0%
i 8837
9.2%
d 8625
9.0%
e 8625
9.0%
r 8625
9.0%
m 8625
9.0%
s 8625
9.0%
p 8625
9.0%
l 212
 
0.2%
Other values (2) 212
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 95723
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 17462
18.2%
o 17250
18.0%
i 8837
9.2%
d 8625
9.0%
e 8625
9.0%
r 8625
9.0%
m 8625
9.0%
s 8625
9.0%
p 8625
9.0%
l 212
 
0.2%
Other values (2) 212
 
0.2%

lesion_id
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 8370
Distinct (%): 98.8%
Missing: 262
Missing (%): 3.0%
Memory size: 136.4 KiB

IL_6821221: 6
IL_7051641: 4
IL_2055980: 4
IL_8955295: 3
IL_6361168: 2
Other values (8365): 8451

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 84700
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8279
Unique (%): 97.7%

Sample

1st row: IL_3797557
2nd row: IL_3211111
3rd row: IL_3949403
4th row: IL_3211111
5th row: IL_6961144

Common Values

Value                 Count   Frequency (%)
IL_6821221            6       0.1%
IL_7051641            4       < 0.1%
IL_2055980            4       < 0.1%
IL_8955295            3       < 0.1%
IL_6361168            2       < 0.1%
IL_5806562            2       < 0.1%
IL_3440737            2       < 0.1%
IL_2147560            2       < 0.1%
IL_5870994            2       < 0.1%
IL_6036150            2       < 0.1%
Other values (8360)   8441    96.7%
(Missing)             262     3.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
il_6821221 6
 
0.1%
il_2055980 4
 
< 0.1%
il_7051641 4
 
< 0.1%
il_8955295 3
 
< 0.1%
il_3664250 2
 
< 0.1%
il_1694184 2
 
< 0.1%
il_6841868 2
 
< 0.1%
il_5765588 2
 
< 0.1%
il_6961144 2
 
< 0.1%
il_0174675 2
 
< 0.1%
Other values (8360) 8441
99.7%

Most occurring characters

ValueCountFrequency (%)
I 8470
10.0%
L 8470
10.0%
_ 8470
10.0%
3 6055
 
7.1%
2 5984
 
7.1%
9 5963
 
7.0%
0 5957
 
7.0%
5 5942
 
7.0%
7 5925
 
7.0%
1 5905
 
7.0%
Other values (3) 17559
20.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 59290
70.0%
Uppercase Letter 16940
 
20.0%
Connector Punctuation 8470
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 6055
10.2%
2 5984
10.1%
9 5963
10.1%
0 5957
10.0%
5 5942
10.0%
7 5925
10.0%
1 5905
10.0%
6 5889
9.9%
8 5864
9.9%
4 5806
9.8%
Uppercase Letter
ValueCountFrequency (%)
I 8470
50.0%
L 8470
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8470
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 67760
80.0%
Latin 16940
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
_ 8470
12.5%
3 6055
8.9%
2 5984
8.8%
9 5963
8.8%
0 5957
8.8%
5 5942
8.8%
7 5925
8.7%
1 5905
8.7%
6 5889
8.7%
8 5864
8.7%
Latin
ValueCountFrequency (%)
I 8470
50.0%
L 8470
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 84700
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 8470
10.0%
L 8470
10.0%
_ 8470
10.0%
3 6055
 
7.1%
2 5984
 
7.1%
9 5963
 
7.0%
0 5957
 
7.0%
5 5942
 
7.0%
7 5925
 
7.0%
1 5905
 
7.0%
Other values (3) 17559
20.7%

mel_class
Categorical

Distinct 2
Distinct (%) 1.4%
Missing 8584
Missing (%) 98.3%
Memory size 136.4 KiB
melanoma in situ 80
invasive melanoma 68

Length

Max length 17
Median length 16
Mean length 16.459459
Min length 16

Characters and Unicode

Total characters 2436
Distinct characters 12
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row invasive melanoma
2nd row melanoma in situ
3rd row melanoma in situ
4th row melanoma in situ
5th row melanoma in situ

Common Values

ValueCountFrequency (%)
melanoma in situ 80
 
0.9%
invasive melanoma 68
 
0.8%
(Missing) 8584
98.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
melanoma 148
39.4%
in 80
21.3%
situ 80
21.3%
invasive 68
18.1%

Most occurring characters

ValueCountFrequency (%)
a 364
14.9%
m 296
12.2%
n 296
12.2%
i 296
12.2%
(space) 228
9.4%
e 216
8.9%
l 148
6.1%
o 148
6.1%
s 148
6.1%
v 136
 
5.6%
Other values (2) 160
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2208
90.6%
Space Separator 228
 
9.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 364
16.5%
m 296
13.4%
n 296
13.4%
i 296
13.4%
e 216
9.8%
l 148
6.7%
o 148
6.7%
s 148
6.7%
v 136
 
6.2%
t 80
 
3.6%
Space Separator
ValueCountFrequency (%)
(space) 228
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2208
90.6%
Common 228
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 364
16.5%
m 296
13.4%
n 296
13.4%
i 296
13.4%
e 216
9.8%
l 148
6.7%
o 148
6.7%
s 148
6.7%
v 136
 
6.2%
t 80
 
3.6%
Common
ValueCountFrequency (%)
(space) 228
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2436
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 364
14.9%
m 296
12.2%
n 296
12.2%
i 296
12.2%
(space) 228
9.4%
e 216
8.9%
l 148
6.1%
o 148
6.1%
s 148
6.1%
v 136
 
5.6%
Other values (2) 160
6.6%

mel_thick_mm
Real number (ℝ)

Distinct 22
Distinct (%) 22.4%
Missing 8634
Missing (%) 98.9%
Infinite 0
Infinite (%) 0.0%
Mean 0.44183673
Minimum 0
Maximum 7.3
Zeros 24
Zeros (%) 0.3%
Negative 0
Negative (%) 0.0%
Memory size 136.4 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0.185
median 0.3
Q3 0.4375
95-th percentile 1.186
Maximum 7.3
Range 7.3
Interquartile range (IQR) 0.2525

Descriptive statistics

Standard deviation 0.80985979
Coefficient of variation (CV) 1.832939
Kurtosis 53.80721
Mean 0.44183673
Median Absolute Deviation (MAD) 0.135
Skewness 6.6127432
Sum 43.3
Variance 0.65587288
Monotonicity Not monotonic
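
Several of the figures above are derived from one another, and the relationships can be checked with a few lines of arithmetic. The sketch below uses only values copied from this summary and assumes the usual definitions (mean = sum / non-missing count, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). The variable names are illustrative.

```python
# Values copied from the mel_thick_mm summary.
n_total = 8732        # observations in the dataset
n_missing = 8634      # missing values for mel_thick_mm
total = 43.3          # Sum
std = 0.80985979      # Standard deviation
q1, q3 = 0.185, 0.4375

n = n_total - n_missing   # non-missing observations
mean = total / n
cv = std / mean           # coefficient of variation
variance = std ** 2
iqr = q3 - q1

print(f"n={n} mean={mean:.8f} cv={cv:.6f} variance={variance:.8f} iqr={iqr:.4f}")
```

Under these definitions the recomputed mean, CV, variance, and IQR agree with the reported values to the precision shown, and the 98 non-missing rows also explain the "Distinct (%)" figure (22 / 98 ≈ 22.4%).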
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
0 24
 
0.3%
0.4 11
 
0.1%
0.3 10
 
0.1%
0.2 9
 
0.1%
0.5 6
 
0.1%
0.8 4
 
< 0.1%
0.25 4
 
< 0.1%
1.9 4
 
< 0.1%
0.24 4
 
< 0.1%
0.32 3
 
< 0.1%
Other values (12) 19
 
0.2%
(Missing) 8634
98.9%
ValueCountFrequency (%)
0 24
0.3%
0.18 1
 
< 0.1%
0.2 9
 
0.1%
0.24 4
 
< 0.1%
0.25 4
 
< 0.1%
0.29 2
 
< 0.1%
0.3 10
0.1%
0.32 3
 
< 0.1%
0.35 3
 
< 0.1%
0.37 2
 
< 0.1%
ValueCountFrequency (%)
7.3 1
 
< 0.1%
1.9 4
< 0.1%
1.06 3
< 0.1%
1 1
 
< 0.1%
0.8 4
< 0.1%
0.71 1
 
< 0.1%
0.7 1
 
< 0.1%
0.55 1
 
< 0.1%
0.5 6
0.1%
0.47 2
 
< 0.1%

mel_type
Categorical

IMBALANCE  MISSING 

Distinct 2
Distinct (%) 16.7%
Missing 8720
Missing (%) 99.9%
Memory size 136.4 KiB
superficial spreading melanoma 11
lentigo maligna melanoma 1

Length

Max length 30
Median length 30
Mean length 29.5
Min length 24

Characters and Unicode

Total characters 354
Distinct characters 17
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 8.3%

Sample

1st row lentigo maligna melanoma
2nd row superficial spreading melanoma
3rd row superficial spreading melanoma
4th row superficial spreading melanoma
5th row superficial spreading melanoma

Common Values

ValueCountFrequency (%)
superficial spreading melanoma 11
 
0.1%
lentigo maligna melanoma 1
 
< 0.1%
(Missing) 8720
99.9%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
melanoma 12
33.3%
superficial 11
30.6%
spreading 11
30.6%
lentigo 1
 
2.8%
maligna 1
 
2.8%

Most occurring characters

ValueCountFrequency (%)
a 48
13.6%
e 35
9.9%
i 35
9.9%
l 25
 
7.1%
m 25
 
7.1%
n 25
 
7.1%
(space) 24
 
6.8%
s 22
 
6.2%
r 22
 
6.2%
p 22
 
6.2%
Other values (7) 71
20.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 330
93.2%
Space Separator 24
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 48
14.5%
e 35
10.6%
i 35
10.6%
l 25
7.6%
m 25
7.6%
n 25
7.6%
s 22
 
6.7%
r 22
 
6.7%
p 22
 
6.7%
g 13
 
3.9%
Other values (6) 58
17.6%
Space Separator
ValueCountFrequency (%)
(space) 24
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 330
93.2%
Common 24
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 48
14.5%
e 35
10.6%
i 35
10.6%
l 25
7.6%
m 25
7.6%
n 25
7.6%
s 22
 
6.7%
r 22
 
6.7%
p 22
 
6.7%
g 13
 
3.9%
Other values (6) 58
17.6%
Common
ValueCountFrequency (%)
(space) 24
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 354
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 48
13.6%
e 35
9.9%
i 35
9.9%
l 25
 
7.1%
m 25
 
7.1%
n 25
 
7.1%
(space) 24
 
6.8%
s 22
 
6.2%
r 22
 
6.2%
p 22
 
6.2%
Other values (7) 71
20.1%

mel_ulcer
Boolean

Distinct 2
Distinct (%) 3.1%
Missing 8668
Missing (%) 99.3%
Memory size 136.4 KiB
False 54
True 10
(Missing) 8668
ValueCountFrequency (%)
False 54
 
0.6%
True 10
 
0.1%
(Missing) 8668
99.3%

melanocytic
Boolean

Distinct 2
Distinct (%) 0.2%
Missing 7869
Missing (%) 90.1%
Memory size 136.4 KiB
True 718
False 145
(Missing) 7869
ValueCountFrequency (%)
True 718
 
8.2%
False 145
 
1.7%
(Missing) 7869
90.1%

nevus_type
Categorical

IMBALANCE  MISSING 

Distinct 5
Distinct (%) 4.3%
Missing 8616
Missing (%) 98.7%
Memory size 136.4 KiB
nevus NOS 104
blue 5
combined 4
persistent/recurrent 2
spitz 1

Length

Max length 20
Median length 9
Mean length 8.9051724
Min length 4

Characters and Unicode

Total characters 1033
Distinct characters 21
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 0.9%

Sample

1st row blue
2nd row persistent/recurrent
3rd row combined
4th row nevus NOS
5th row nevus NOS

Common Values

ValueCountFrequency (%)
nevus NOS 104
 
1.2%
blue 5
 
0.1%
combined 4
 
< 0.1%
persistent/recurrent 2
 
< 0.1%
spitz 1
 
< 0.1%
(Missing) 8616
98.7%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
nevus 104
47.3%
nos 104
47.3%
blue 5
 
2.3%
combined 4
 
1.8%
persistent/recurrent 2
 
0.9%
spitz 1
 
0.5%

Most occurring characters

ValueCountFrequency (%)
e 121
11.7%
n 112
10.8%
u 111
10.7%
s 109
10.6%
v 104
10.1%
(space) 104
10.1%
N 104
10.1%
O 104
10.1%
S 104
10.1%
b 9
 
0.9%
Other values (11) 51
4.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 615
59.5%
Uppercase Letter 312
30.2%
Space Separator 104
 
10.1%
Other Punctuation 2
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 121
19.7%
n 112
18.2%
u 111
18.0%
s 109
17.7%
v 104
16.9%
b 9
 
1.5%
r 8
 
1.3%
i 7
 
1.1%
t 7
 
1.1%
c 6
 
1.0%
Other values (6) 21
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
N 104
33.3%
O 104
33.3%
S 104
33.3%
Space Separator
ValueCountFrequency (%)
(space) 104
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 927
89.7%
Common 106
 
10.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 121
13.1%
n 112
12.1%
u 111
12.0%
s 109
11.8%
v 104
11.2%
N 104
11.2%
O 104
11.2%
S 104
11.2%
b 9
 
1.0%
r 8
 
0.9%
Other values (9) 41
 
4.4%
Common
ValueCountFrequency (%)
(space) 104
98.1%
/ 2
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1033
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 121
11.7%
n 112
10.8%
u 111
10.7%
s 109
10.6%
v 104
10.1%
(space) 104
10.1%
N 104
10.1%
O 104
10.1%
S 104
10.1%
b 9
 
0.9%
Other values (11) 51
4.9%

patient_id
Categorical

HIGH CARDINALITY  MISSING 

Distinct 931
Distinct (%) 11.0%
Missing 262
Missing (%) 3.0%
Memory size 136.4 KiB
IP_7279968 115
IP_4382720 115
IP_0656529 114
IP_9147454 102
IP_6245507 102
Other values (926) 7922

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 84700
Distinct characters 13
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 429
Unique (%) 5.1%

Sample

1st row IP_4906546
2nd row IP_1218261
3rd row IP_1770335
4th row IP_1218261
5th row IP_1218261

Common Values

ValueCountFrequency (%)
IP_7279968 115
 
1.3%
IP_4382720 115
 
1.3%
IP_0656529 114
 
1.3%
IP_9147454 102
 
1.2%
IP_6245507 102
 
1.2%
IP_1139701 102
 
1.2%
IP_4419570 102
 
1.2%
IP_3057277 102
 
1.2%
IP_1969685 100
 
1.1%
IP_2153088 92
 
1.1%
Other values (921) 7424
85.0%
(Missing) 262
 
3.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ip_7279968 115
 
1.4%
ip_4382720 115
 
1.4%
ip_0656529 114
 
1.3%
ip_4419570 102
 
1.2%
ip_3057277 102
 
1.2%
ip_9147454 102
 
1.2%
ip_1139701 102
 
1.2%
ip_6245507 102
 
1.2%
ip_1969685 100
 
1.2%
ip_2153088 92
 
1.1%
Other values (921) 7424
87.7%

Most occurring characters

ValueCountFrequency (%)
I 8470
10.0%
P 8470
10.0%
_ 8470
10.0%
0 6388
 
7.5%
5 6330
 
7.5%
4 6207
 
7.3%
8 6196
 
7.3%
2 5908
 
7.0%
7 5891
 
7.0%
9 5745
 
6.8%
Other values (3) 16625
19.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 59290
70.0%
Uppercase Letter 16940
 
20.0%
Connector Punctuation 8470
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6388
10.8%
5 6330
10.7%
4 6207
10.5%
8 6196
10.5%
2 5908
10.0%
7 5891
9.9%
9 5745
9.7%
6 5725
9.7%
3 5580
9.4%
1 5320
9.0%
Uppercase Letter
ValueCountFrequency (%)
I 8470
50.0%
P 8470
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8470
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 67760
80.0%
Latin 16940
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
_ 8470
12.5%
0 6388
9.4%
5 6330
9.3%
4 6207
9.2%
8 6196
9.1%
2 5908
8.7%
7 5891
8.7%
9 5745
8.5%
6 5725
8.4%
3 5580
8.2%
Latin
ValueCountFrequency (%)
I 8470
50.0%
P 8470
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 84700
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 8470
10.0%
P 8470
10.0%
_ 8470
10.0%
0 6388
 
7.5%
5 6330
 
7.5%
4 6207
 
7.3%
8 6196
 
7.3%
2 5908
 
7.0%
7 5891
 
7.0%
9 5745
 
6.8%
Other values (3) 16625
19.6%

personal_hx_mm
Boolean

Distinct 2
Distinct (%) < 0.1%
Missing 19
Missing (%) 0.2%
Memory size 136.4 KiB
True 5256
False 3457
(Missing) 19
ValueCountFrequency (%)
True 5256
60.2%
False 3457
39.6%
(Missing) 19
 
0.2%

pixels_x
Real number (ℝ)

Distinct 209
Distinct (%) 2.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5305.6093
Minimum 85
Maximum 7360
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 136.4 KiB

Quantile statistics

Minimum 85
5-th percentile 1920
Q1 6000
median 6000
Q3 6000
95-th percentile 6000
Maximum 7360
Range 7275
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 1402.6993
Coefficient of variation (CV) 0.26438044
Kurtosis 2.474481
Mean 5305.6093
Median Absolute Deviation (MAD) 0
Skewness -1.9231805
Sum 46328580
Variance 1967565.4
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6000 6564
75.2%
3264 605
 
6.9%
5184 544
 
6.2%
1920 325
 
3.7%
3072 154
 
1.8%
640 114
 
1.3%
3024 94
 
1.1%
4032 42
 
0.5%
3872 35
 
0.4%
2048 22
 
0.3%
Other values (199) 233
 
2.7%
ValueCountFrequency (%)
85 1
< 0.1%
166 1
< 0.1%
169 1
< 0.1%
170 1
< 0.1%
187 1
< 0.1%
190 1
< 0.1%
229 1
< 0.1%
239 1
< 0.1%
253 1
< 0.1%
257 1
< 0.1%
ValueCountFrequency (%)
7360 1
 
< 0.1%
6000 6564
75.2%
5472 1
 
< 0.1%
5184 544
 
6.2%
4608 2
 
< 0.1%
4128 1
 
< 0.1%
4032 42
 
0.5%
3872 35
 
0.4%
3390 1
 
< 0.1%
3333 1
 
< 0.1%

pixels_y
Real number (ℝ)

Distinct 204
Distinct (%) 2.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 3588.598
Minimum 85
Maximum 4912
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 136.4 KiB

Quantile statistics

Minimum 85
5-th percentile 1080
Q1 4000
median 4000
Q3 4000
95-th percentile 4000
Maximum 4912
Range 4827
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 879.74804
Coefficient of variation (CV) 0.2451509
Kurtosis 3.7582784
Mean 3588.598
Median Absolute Deviation (MAD) 0
Skewness -2.189584
Sum 31335638
Variance 773956.61
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4000 6564
75.2%
2448 606
 
6.9%
3456 544
 
6.2%
1080 325
 
3.7%
2304 154
 
1.8%
480 115
 
1.3%
4032 94
 
1.1%
3024 41
 
0.5%
2592 36
 
0.4%
1536 22
 
0.3%
Other values (194) 231
 
2.6%
ValueCountFrequency (%)
85 1
< 0.1%
150 1
< 0.1%
170 1
< 0.1%
180 1
< 0.1%
186 1
< 0.1%
199 1
< 0.1%
203 1
< 0.1%
214 1
< 0.1%
228 1
< 0.1%
232 1
< 0.1%
ValueCountFrequency (%)
4912 1
 
< 0.1%
4032 94
 
1.1%
4000 6564
75.2%
3872 1
 
< 0.1%
3694 1
 
< 0.1%
3648 1
 
< 0.1%
3598 1
 
< 0.1%
3456 544
 
6.2%
3402 1
 
< 0.1%
3247 1
 
< 0.1%

sex
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 136.4 KiB
female 4419
male 4312

Length

Max length 6
Median length 6
Mean length 5.0122552
Min length 4

Characters and Unicode

Total characters 43762
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row male
2nd row male
3rd row female
4th row male
5th row male

Common Values

ValueCountFrequency (%)
female 4419
50.6%
male 4312
49.4%
(Missing) 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
female 4419
50.6%
male 4312
49.4%

Most occurring characters

ValueCountFrequency (%)
e 13150
30.0%
m 8731
20.0%
a 8731
20.0%
l 8731
20.0%
f 4419
 
10.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 43762
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 13150
30.0%
m 8731
20.0%
a 8731
20.0%
l 8731
20.0%
f 4419
 
10.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 43762
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 13150
30.0%
m 8731
20.0%
a 8731
20.0%
l 8731
20.0%
f 4419
 
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43762
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 13150
30.0%
m 8731
20.0%
a 8731
20.0%
l 8731
20.0%
f 4419
 
10.1%

mel_mitotic_index
Categorical

CONSTANT  MISSING 

Distinct 1
Distinct (%) 100.0%
Missing 8731
Missing (%) > 99.9%
Memory size 136.4 KiB
0/mm^2

Length

Max length 6
Median length 6
Mean length 6
Min length 6

Characters and Unicode

Total characters 6
Distinct characters 5
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 0/mm^2

Common Values

ValueCountFrequency (%)
0/mm^2 1
 
< 0.1%
(Missing) 8731
> 99.9%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
0/mm^2 1
100.0%

Most occurring characters

ValueCountFrequency (%)
m 2
33.3%
0 1
16.7%
/ 1
16.7%
^ 1
16.7%
2 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2
33.3%
Decimal Number 2
33.3%
Other Punctuation 1
16.7%
Modifier Symbol 1
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1
50.0%
2 1
50.0%
Lowercase Letter
ValueCountFrequency (%)
m 2
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
66.7%
Latin 2
33.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1
25.0%
/ 1
25.0%
^ 1
25.0%
2 1
25.0%
Latin
ValueCountFrequency (%)
m 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 2
33.3%
0 1
16.7%
/ 1
16.7%
^ 1
16.7%
2 1
16.7%

Interactions

[Figures: 49 pairwise interaction plots; images not preserved]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
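
One way to read "nullity correlation" is as the Pearson correlation between two columns' missing-value indicators. The sketch below is a minimal standard-library version of that idea, not the report's own code. The function name `nullity_correlation` and the toy rows (including the placeholder IDs) are made up for illustration.

```python
import math


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    `rows` is a list of dicts; a value of None is treated as missing.
    Returns None when either indicator is constant (column always present
    or always missing), because the correlation is then undefined.
    """
    a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Toy rows, invented for illustration (not taken from the dataset).
rows = [
    {"lesion_id": "IL_1", "patient_id": "IP_1", "mel_class": None},
    {"lesion_id": None, "patient_id": None, "mel_class": None},
    {"lesion_id": "IL_2", "patient_id": "IP_2", "mel_class": "melanoma in situ"},
    {"lesion_id": None, "patient_id": None, "mel_class": "invasive melanoma"},
]

print(nullity_correlation(rows, "lesion_id", "patient_id"))  # 1.0
print(nullity_correlation(rows, "lesion_id", "mel_class"))   # 0.0
```

A value of 1 means the two columns are always missing together, 0 means their missingness is unrelated, and negative values mean one tends to be present when the other is absent. With pandas, the same matrix is commonly obtained as `df.isnull().corr()`.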

Sample

First rows

(index) | Unnamed: 0 | isic_id | attribution | copyright_license | acquisition_day | age_approx | anatom_site_general | benign_malignant | clin_size_long_diam_mm | dermoscopic_type | diagnosis | diagnosis_confirm_type | family_hx_mm | image_type | lesion_id | mel_class | mel_thick_mm | mel_type | mel_ulcer | melanocytic | nevus_type | patient_id | personal_hx_mm | pixels_x | pixels_y | sex | mel_mitotic_index
0 | 0 | ISIC_3079785 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 45 | lower extremity | benign | NaN | contact polarized | nevus | histopathology | False | dermoscopic | IL_3797557 | NaN | NaN | NaN | NaN | NaN | NaN | IP_4906546 | False | 640 | 480 | male | NaN
1 | 1 | ISIC_2107859 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 65 | head/neck | malignant | NaN | NaN | melanoma | histopathology | False | clinical | IL_3211111 | NaN | NaN | NaN | NaN | NaN | NaN | IP_1218261 | True | 1274 | 2620 | male | NaN
2 | 2 | ISIC_3443621 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 85 | head/neck | benign | NaN | contact polarized | solar lentigo | histopathology | False | dermoscopic | IL_3949403 | NaN | NaN | NaN | NaN | NaN | NaN | IP_1770335 | False | 640 | 480 | female | NaN
3 | 3 | ISIC_2368449 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 65 | head/neck | malignant | NaN | contact polarized | melanoma | histopathology | False | dermoscopic | IL_3211111 | NaN | NaN | NaN | NaN | NaN | NaN | IP_1218261 | True | 1488 | 3059 | male | NaN
4 | 4 | ISIC_0094098 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 65 | posterior torso | malignant | NaN | NaN | melanoma | histopathology | False | clinical | IL_6961144 | NaN | NaN | NaN | NaN | NaN | NaN | IP_1218261 | True | 855 | 661 | male | NaN
5 | 5 | ISIC_1452632 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 65 | posterior torso | malignant | NaN | contact polarized | melanoma | histopathology | False | dermoscopic | IL_6961144 | NaN | NaN | NaN | NaN | NaN | NaN | IP_1218261 | True | 4032 | 1960 | male | NaN
6 | 6 | ISIC_9098311 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 65 | head/neck | malignant | NaN | NaN | melanoma | histopathology | False | clinical | IL_6841868 | NaN | NaN | NaN | NaN | NaN | NaN | IP_5804995 | True | 2358 | 2178 | male | NaN
7 | 7 | ISIC_8786983 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 65 | head/neck | malignant | NaN | contact polarized | melanoma | histopathology | False | dermoscopic | IL_6841868 | NaN | NaN | NaN | NaN | NaN | NaN | IP_5804995 | True | 4032 | 3024 | male | NaN
8 | 8 | ISIC_8793798 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 60 | NaN | malignant | NaN | NaN | melanoma | histopathology | False | clinical | IL_3601366 | NaN | NaN | NaN | NaN | NaN | NaN | IP_0240453 | True | 239 | 199 | female | NaN
9 | 9 | ISIC_4326975 | Hospital Italiano de Buenos Aires | CC-BY-NC | NaN | 60 | NaN | malignant | NaN | contact polarized | melanoma | histopathology | False | dermoscopic | IL_3601366 | NaN | NaN | NaN | NaN | NaN | NaN | IP_0240453 | True | 435 | 441 | female | NaN

Last rows

(index) | Unnamed: 0 | isic_id | attribution | copyright_license | acquisition_day | age_approx | anatom_site_general | benign_malignant | clin_size_long_diam_mm | dermoscopic_type | diagnosis | diagnosis_confirm_type | family_hx_mm | image_type | lesion_id | mel_class | mel_thick_mm | mel_type | mel_ulcer | melanocytic | nevus_type | patient_id | personal_hx_mm | pixels_x | pixels_y | sex | mel_mitotic_index
10003 | 3595 | ISIC_4608393 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | upper extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_3429390 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10004 | 3596 | ISIC_2362404 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | upper extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_6417209 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10005 | 3597 | ISIC_8285182 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | upper extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_9141269 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10006 | 3598 | ISIC_2638179 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_6768019 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10007 | 3599 | ISIC_6284722 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_8087895 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10008 | 3600 | ISIC_8732572 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_1017997 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10009 | 3601 | ISIC_0707446 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_2580397 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10010 | 3602 | ISIC_8021069 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_4724384 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10011 | 3603 | ISIC_6721280 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_0309191 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN
10012 | 3604 | ISIC_9354947 | The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre | CC-BY | 394.0 | 70 | lower extremity | benign | NaN | contact non-polarized | NaN | serial imaging showing no change | True | dermoscopic | IL_6859221 | NaN | NaN | NaN | NaN | NaN | NaN | IP_7651325 | True | 6000 | 4000 | female | NaN